Installing Apptainer on EC2 and Running Trinity
Hello! This is inomaso (@inomasosan) from the Consulting Department.
In this post, I tested whether a Trinity container can run on Apptainer installed on an EC2 instance.
What is Apptainer?
For an overview of Apptainer and how to install it on EC2, please refer to the separate blog post on that topic.
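What is Trinity?
Trinity is a de novo assembler for RNA-Seq data.
It stitches together the short reads obtained from a next-generation sequencer (NGS) to reconstruct transcript sequences, so it can be used even when no reference genome is available.
Readers who want more background on sequence assembly may find an introductory article on genome assembly helpful.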
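Trying It Out
Using the instructions on GitHub as a reference, I installed Apptainer on a single EC2 instance and launched the Trinity container.
This assumes that the EC2 instance has already been built and that Apptainer is installed.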
Test Environment
The environment I built for this test is as follows.
Item | Value |
---|---|
OS | Ubuntu Server 22.04 LTS |
AMI | ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20230516 |
Instance type | c6a.large |
Storage | 30 GiB |
Downloading Sample Files for Assembly
When you connect with SSM Session Manager, the default working directory is under /var/snap/amazon-ssm-agent/xxxx.
So, switch to the user's home directory before downloading the sample files.
$ sudo su - ssm-user
Test data for Docker is available on GitHub, so I downloaded the sample files from there.
$ wget https://github.com/trinityrnaseq/trinityrnaseq/raw/master/Docker/test_data/reads.left.fq.gz
$ wget https://github.com/trinityrnaseq/trinityrnaseq/raw/master/Docker/test_data/reads.right.fq.gz
Just to be safe, confirm that the downloads succeeded.
$ ls -la
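A directory listing only shows that the files exist. A mistyped URL can still save something under the expected file name, so a slightly stricter check may help. The following is a minimal sketch, assuming `gzip` is installed on the host; `check_gz` is a hypothetical helper name, not part of any tool used here.

```shell
# check_gz FILE: succeed only if FILE exists, is non-empty, and is a valid gzip stream.
check_gz() {
  if [ ! -s "$1" ]; then
    echo "missing or empty: $1" >&2
    return 1
  fi
  if ! gzip -t "$1" 2>/dev/null; then
    echo "not a valid gzip file: $1" >&2
    return 1
  fi
  echo "ok: $1"
}

# Check both sample files if they are present in the current directory.
for f in reads.left.fq.gz reads.right.fq.gz; do
  if [ -e "$f" ]; then
    check_gz "$f" || true
  fi
done
```

`gzip -t` only tests the integrity of the compressed stream; it does not confirm that the content is FASTQ.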
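Running the Trinity Container on Apptainer
I ran it using the Singularity command on GitHub as a reference.
Apptainer is compatible with Docker (OCI) images and automatically converts them from OCI format to SIF format at launch, so this test uses the image from Docker Hub.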
$ apptainer run docker://trinityrnaseq/trinityrnaseq:latest Trinity \
    --seqType fq \
    --left `pwd`/reads.left.fq.gz \
    --right `pwd`/reads.right.fq.gz \
    --NO_SEQTK \
    --max_memory 1G --CPU 4 \
    --output `pwd`/trinity_out_dir
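The `--CPU` and `--max_memory` values in the command above are fixed. On a different instance type it may be convenient to derive them from the host instead. The sketch below assumes a Linux host with `nproc` and `/proc/meminfo`; the helper names are hypothetical, and using half of the available memory is an arbitrary rule of thumb rather than a Trinity recommendation.

```shell
# Number of CPUs visible to this shell (falls back to 1 if nproc is unavailable).
host_cpus() {
  nproc 2>/dev/null || echo 1
}

# Half of the currently available memory in whole GiB, with a floor of 1.
# Reads MemAvailable from /proc/meminfo (Linux only); prints 1 if it cannot be read.
half_avail_mem_gib() {
  if [ -r /proc/meminfo ]; then
    awk '/^MemAvailable:/ { g = int($2 / 1024 / 1024 / 2); found = 1 }
         END { if (!found || g < 1) g = 1; print g }' /proc/meminfo
  else
    echo 1
  fi
}

CPU="$(host_cpus)"
MAX_MEM="$(half_avail_mem_gib)G"
echo "Suggested Trinity options: --CPU ${CPU} --max_memory ${MAX_MEM}"
```

The printed values can then be substituted into the `apptainer run` command by hand or through the `CPU` and `MAX_MEM` variables.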
Also, when a container runs, Apptainer by default mounts the host's /home/$USER, /tmp, and $PWD into the container.
Unlike Docker, you can use files on the host without explicitly mounting volumes.
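If the input data lives outside those default mount points, it has to be bound explicitly with Apptainer's `--bind` option. The following is a minimal sketch, not something run in this test: `/data/reads` is a hypothetical directory, and the script only prints the command unless `RUN=1` is set.

```shell
# Hypothetical data location outside the default mounts; adjust as needed.
DATA_DIR="${DATA_DIR:-/data/reads}"
IMAGE="docker://trinityrnaseq/trinityrnaseq:latest"

# --bind takes SOURCE:DESTINATION; the same path is used on both sides here.
bind_spec="${DATA_DIR}:${DATA_DIR}"

# Dry run by default: print the command. Set RUN=1 to actually execute it.
if [ "${RUN:-0}" = "1" ]; then
  apptainer exec --bind "$bind_spec" "$IMAGE" ls -la "$DATA_DIR"
else
  echo "apptainer exec --bind $bind_spec $IMAGE ls -la $DATA_DIR"
fi
```

Listing the directory from inside the container is a quick way to confirm the bind works before starting a long assembly.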
When Trinity finishes, trinity_out_dir.Trinity.fasta is written to the current directory.
$ ls -la
total 2660
drwxr-x--- 5 ssm-user ssm-user    4096 May 30 06:08 .
drwxr-xr-x 4 root     root        4096 May 30 05:54 ..
drwx------ 3 ssm-user ssm-user    4096 May 30 05:54 .apptainer
-rw-r--r-- 1 ssm-user ssm-user     220 Jan  6  2022 .bash_logout
-rw-r--r-- 1 ssm-user ssm-user    3771 Jan  6  2022 .bashrc
drwx------ 3 ssm-user ssm-user    4096 May 30 05:55 .local
-rw-r--r-- 1 ssm-user ssm-user     807 Jan  6  2022 .profile
-rw-rw-r-- 1 ssm-user ssm-user     215 May 30 05:54 .wget-hsts
-rw-rw-r-- 1 ssm-user ssm-user 1251148 May 30 05:54 reads.left.fq.gz
-rw-rw-r-- 1 ssm-user ssm-user 1272939 May 30 05:54 reads.right.fq.gz
drwxrwxr-x 8 ssm-user ssm-user    4096 May 30 06:08 trinity_out_dir
-rw-rw-r-- 1 ssm-user ssm-user  151717 May 30 06:08 trinity_out_dir.Trinity.fasta
-rw-rw-r-- 1 ssm-user ssm-user    2950 May 30 06:08 trinity_out_dir.Trinity.fasta.gene_trans_map
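Beyond checking that the file exists, counting the FASTA header lines gives a quick sense of how many transcripts were assembled. This is a minimal sketch; `count_fasta_records` is a hypothetical helper name.

```shell
# count_fasta_records FILE: print the number of sequences (header lines starting with ">").
count_fasta_records() {
  grep -c '^>' "$1"
}

# Report on the output file from the run above, if it exists in the current directory.
FASTA="trinity_out_dir.Trinity.fasta"
if [ -s "$FASTA" ]; then
  echo "transcripts: $(count_fasta_records "$FASTA")"
  # Show the first few headers for a quick look at the contig names.
  grep '^>' "$FASTA" | head -n 3
fi
```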
If you are curious about the execution log, the full output is shown below.
Trinity execution log
______ ____ ____ ____ ____ ______ __ __ | || \ | || \ | || || | | | || D ) | | | _ | | | | || | | |_| |_|| / | | | | | | | |_| |_|| ~ | | | | \ | | | | | | | | | |___, | | | | . \ | | | | | | | | | | | |__| |__|\_||____||__|__||____| |__| |____/ Trinity-v2.15.1 Left read files: $VAR1 = [ '/home/ssm-user/reads.left.fq.gz' ]; Right read files: $VAR1 = [ '/home/ssm-user/reads.right.fq.gz' ]; Trinity version: Trinity-v2.15.1 -currently using the latest production release of Trinity. Tuesday, May 30, 2023: 06:07:21 CMD: java -Xmx64m -XX:ParallelGCThreads=2 -jar /usr/local/bin/util/support_scripts/ExitTester.jar 0 Tuesday, May 30, 2023: 06:07:22 CMD: java -Xmx4g -XX:ParallelGCThreads=2 -jar /usr/local/bin/util/support_scripts/ExitTester.jar 1 Tuesday, May 30, 2023: 06:07:22 CMD: mkdir -p /home/ssm-user/trinity_out_dir Tuesday, May 30, 2023: 06:07:22 CMD: mkdir -p /home/ssm-user/trinity_out_dir/chrysalis ---------------------------------------------------------------------------------- -------------- Trinity Phase 1: Clustering of RNA-Seq Reads --------------------- ---------------------------------------------------------------------------------- --------------------------------------------------------------- ------------ In silico Read Normalization --------------------- -- (Removing Excess Reads Beyond 200 Coverage -- --------------------------------------------------------------- # running normalization on reads: $VAR1 = [ [ '/home/ssm-user/reads.left.fq.gz' ], [ '/home/ssm-user/reads.right.fq.gz' ] ]; Tuesday, May 30, 2023: 06:07:22 CMD: /usr/local/bin/util/insilico_read_normalization.pl --seqType fq --JM 1G --max_cov 200 --min_cov 1 --CPU 4 --output /home/ssm-user/trinity_out_dir/insilico_read_normalization --max_CV 10000 --NO_SEQTK --left /home/ssm-user/reads.left.fq.gz --right /home/ssm-user/reads.right.fq.gz --pairs_together --PARALLEL_STATS -prepping seqs Converting input files. 
(both directions in parallel)CMD: /usr/local/bin/util/..//util/support_scripts//fastQ_to_fastA.pl -I <(gunzip -c /home/ssm-user/reads.left.fq.gz) >> left.fa 2> left.readcount CMD: /usr/local/bin/util/..//util/support_scripts//fastQ_to_fastA.pl -I <(gunzip -c /home/ssm-user/reads.right.fq.gz) >> right.fa 2> right.readcount CMD finished (1 seconds) CMD finished (1 seconds) CMD: touch left.fa.ok CMD finished (0 seconds) CMD: touch right.fa.ok CMD finished (0 seconds) Done converting input files.CMD: cat left.fa right.fa > both.fa CMD finished (0 seconds) CMD: touch both.fa.ok CMD finished (0 seconds) -kmer counting. ------------------------------------------- ----------- Jellyfish -------------------- -- (building a k-mer catalog from reads) -- ------------------------------------------- CMD: jellyfish count -t 4 -m 25 -s 100000000 --canonical both.fa CMD finished (1 seconds) CMD: jellyfish histo -t 4 -o jellyfish.K25.min2.kmers.fa.histo mer_counts.jf CMD finished (0 seconds) CMD: jellyfish dump -L 2 mer_counts.jf > jellyfish.K25.min2.kmers.fa CMD finished (0 seconds) CMD: touch jellyfish.K25.min2.kmers.fa.success CMD finished (0 seconds) -generating stats files CMD: /usr/local/bin/util/..//Inchworm/bin/fastaToKmerCoverageStats --reads left.fa --kmers jellyfish.K25.min2.kmers.fa --kmer_size 25 --num_threads 2 --DS > left.fa.K25.stats CMD: /usr/local/bin/util/..//Inchworm/bin/fastaToKmerCoverageStats --reads right.fa --kmers jellyfish.K25.min2.kmers.fa --kmer_size 25 --num_threads 2 --DS> right.fa.K25.stats -reading Kmer occurrences... -reading Kmer occurrences... done parsing 100964 Kmers, 100964 added, taking 0 seconds. done parsing 100964 Kmers, 100964 added, taking 0 seconds. STATS_GENERATION_TIME: 1 seconds. CMD finished (1 seconds) STATS_GENERATION_TIME: 1 seconds. CMD finished (1 seconds) CMD: touch left.fa.K25.stats.ok CMD finished (0 seconds) CMD: touch right.fa.K25.stats.ok CMD finished (0 seconds) -sorting each stats file by read name. 
CMD: head -n1 left.fa.K25.stats > left.fa.K25.stats.sort && tail -n +2 left.fa.K25.stats | /usr/bin/sort --parallel=4 -k1,1 -T . -S 1G >> left.fa.K25.stats.sort CMD: head -n1 right.fa.K25.stats > right.fa.K25.stats.sort && tail -n +2 right.fa.K25.stats | /usr/bin/sort --parallel=4 -k1,1 -T . -S 1G >> right.fa.K25.stats.sort CMD finished (0 seconds) CMD finished (0 seconds) CMD: touch left.fa.K25.stats.sort.ok CMD finished (0 seconds) CMD: touch right.fa.K25.stats.sort.ok CMD finished (0 seconds) -defining normalized reads CMD: /usr/local/bin/util/..//util/support_scripts//nbkc_merge_left_right_stats.pl --left left.fa.K25.stats.sort --right right.fa.K25.stats.sort --sorted > pairs.K25.stats -opening left.fa.K25.stats.sort -opening right.fa.K25.stats.sort -done opening files. CMD finished (0 seconds) CMD: touch pairs.K25.stats.ok CMD finished (0 seconds) CMD: /usr/local/bin/util/..//util/support_scripts//nbkc_normalize.pl --stats_file pairs.K25.stats --max_cov 200 --min_cov 1 --max_CV 10000 > pairs.K25.stats.C200.maxCV10000.accs 30472 / 30575 = 99.66% reads selected during normalization. 0 / 30575 = 0.00% reads discarded as likely aberrant based on coverage profiles. 0 / 30575 = 0.00% reads discarded as below minimum coverage threshold=1 CMD finished (0 seconds) CMD: touch pairs.K25.stats.C200.maxCV10000.accs.ok CMD finished (0 seconds) -search and capture. -preparing to extract selected reads from: /home/ssm-user/reads.right.fq.gz ... done prepping, now search and capture. -capturing normalized reads from: /home/ssm-user/reads.right.fq.gz -preparing to extract selected reads from: /home/ssm-user/reads.left.fq.gz ... done prepping, now search and capture. 
-capturing normalized reads from: /home/ssm-user/reads.left.fq.gz CMD: touch /home/ssm-user/trinity_out_dir/insilico_read_normalization/reads.left.fq.gz.normalized_K25_maxC200_minC1_maxCV10000.fq.ok CMD finished (0 seconds) CMD: touch /home/ssm-user/trinity_out_dir/insilico_read_normalization/reads.right.fq.gz.normalized_K25_maxC200_minC1_maxCV10000.fq.ok CMD finished (0 seconds) CMD: ln -sf /home/ssm-user/trinity_out_dir/insilico_read_normalization/reads.left.fq.gz.normalized_K25_maxC200_minC1_maxCV10000.fq left.norm.fq CMD finished (0 seconds) CMD: ln -sf /home/ssm-user/trinity_out_dir/insilico_read_normalization/reads.right.fq.gz.normalized_K25_maxC200_minC1_maxCV10000.fq right.norm.fq CMD finished (0 seconds) -removing tmp dir /home/ssm-user/trinity_out_dir/insilico_read_normalization/tmp_normalized_reads Normalization complete. See outputs: /home/ssm-user/trinity_out_dir/insilico_read_normalization/reads.left.fq.gz.normalized_K25_maxC200_minC1_maxCV10000.fq /home/ssm-user/trinity_out_dir/insilico_read_normalization/reads.right.fq.gz.normalized_K25_maxC200_minC1_maxCV10000.fq Tuesday, May 30, 2023: 06:07:26 CMD: touch /home/ssm-user/trinity_out_dir/insilico_read_normalization/normalization.ok Converting input files. 
(in parallel)Tuesday, May 30, 2023: 06:07:26 CMD: /usr/local/bin/util/support_scripts/fastQ_to_fastA.pl -I /home/ssm-user/trinity_out_dir/insilico_read_normalization/left.norm.fq >> left.fa 2> /home/ssm-user/trinity_out_dir/insilico_read_normalization/left.norm.fq.readcount Tuesday, May 30, 2023: 06:07:26 CMD: /usr/local/bin/util/support_scripts/fastQ_to_fastA.pl -I /home/ssm-user/trinity_out_dir/insilico_read_normalization/right.norm.fq >> right.fa 2> /home/ssm-user/trinity_out_dir/insilico_read_normalization/right.norm.fq.readcount Tuesday, May 30, 2023: 06:07:27 CMD: touch right.fa.ok Tuesday, May 30, 2023: 06:07:27 CMD: touch left.fa.ok Tuesday, May 30, 2023: 06:07:27 CMD: touch left.fa.ok right.fa.ok Tuesday, May 30, 2023: 06:07:27 CMD: cat left.fa right.fa > /home/ssm-user/trinity_out_dir/both.fa Tuesday, May 30, 2023: 06:07:27 CMD: touch /home/ssm-user/trinity_out_dir/both.fa.ok ------------------------------------------- ----------- Jellyfish -------------------- -- (building a k-mer (25) catalog from reads) -- ------------------------------------------- * [Tue May 30 06:07:27 2023] Running CMD: jellyfish count -t 4 -m 25 -s 100000000 -o mer_counts.25.asm.jf --canonical /home/ssm-user/trinity_out_dir/both.fa * [Tue May 30 06:07:28 2023] Running CMD: jellyfish dump -L 1 mer_counts.25.asm.jf > jellyfish.kmers.25.asm.fa * [Tue May 30 06:07:28 2023] Running CMD: jellyfish histo -t 4 -o jellyfish.kmers.25.asm.fa.histo mer_counts.25.asm.jf ---------------------------------------------- --------------- Inchworm (K=25, asm) --------------------- -- (Linear contig construction from k-mers) -- ---------------------------------------------- * [Tue May 30 06:07:28 2023] Running CMD: /usr/local/bin/Inchworm/bin//inchworm --kmers jellyfish.kmers.25.asm.fa --run_inchworm -K 25 --monitor 1 --DS --num_threads 4 --PARALLEL_IWORM --min_any_entropy 1.0 -L 25 --no_prune_error_kmers > /home/ssm-user/trinity_out_dir/inchworm.DS.fa.tmp Kmer length set to: 25 Min assembly length 
set to: 25 Monitor turned on, set to: 1 double stranded mode set min entropy set to: 1 setting number of threads to: 4 -setting parallel iworm mode. -reading Kmer occurrences... [0M] Kmers parsed. done parsing 517949 Kmers, 517949 added, taking 0 seconds. TIMING KMER_DB_BUILDING 0 s. Pruning kmers (min_kmer_count=1 min_any_entropy=1 min_ratio_non_error=0.005) Pruned 4114 kmers from catalog. Pruning time: 1 seconds = 0.0166667 minutes. TIMING PRUNING 1 s. -populating the kmer seed candidate list. Kcounter hash size: 517949 Processed 513835 non-zero abundance kmers in kcounter. -Not sorting list of kmers, given parallel mode in effect. -beginning inchworm contig assembly. Total kcounter hash size: 517949 vs. sorted list size: 513835 num threads set to: 4 Done opening file. tmp.iworm.fa.pid_3435.thread_0 Done opening file. tmp.iworm.fa.pid_3435.thread_1 Done opening file. tmp.iworm.fa.pid_3435.thread_2 Done opening file. tmp.iworm.fa.pid_3435.thread_3 Iworm contig assembly time: 0 seconds = 0 minutes. TIMING CONTIG_BUILDING 0 s. TIMING PROG_RUNTIME 1 s. 
* [Tue May 30 06:07:29 2023] Running CMD: mv /home/ssm-user/trinity_out_dir/inchworm.DS.fa.tmp /home/ssm-user/trinity_out_dir/inchworm.DS.fa Tuesday, May 30, 2023: 06:07:29 CMD: touch /home/ssm-user/trinity_out_dir/inchworm.DS.fa.finished -------------------------------------------------------- -------------------- Chrysalis ------------------------- -- (Contig Clustering & de Bruijn Graph Construction) -- -------------------------------------------------------- inchworm_target: /home/ssm-user/trinity_out_dir/both.fa bowtie_reads_fa: /home/ssm-user/trinity_out_dir/both.fa chrysalis_reads_fa: /home/ssm-user/trinity_out_dir/both.fa * [Tue May 30 06:07:29 2023] Running CMD: /usr/local/bin/util/support_scripts/filter_iworm_by_min_length_or_cov.pl /home/ssm-user/trinity_out_dir/inchworm.DS.fa 100 10 > /home/ssm-user/trinity_out_dir/chrysalis/inchworm.DS.fa.min100 * [Tue May 30 06:07:29 2023] Running CMD: /usr/local/bin/bowtie2-build --threads 4 -o 3 /home/ssm-user/trinity_out_dir/chrysalis/inchworm.DS.fa.min100 /home/ssm-user/trinity_out_dir/chrysalis/inchworm.DS.fa.min100 1>/dev/null * [Tue May 30 06:07:30 2023] Running CMD: bash -c " set -o pipefail;/usr/local/bin/bowtie2 --local -k 2 --no-unal --threads 4 -f --score-min G,20,8 -x /home/ssm-user/trinity_out_dir/chrysalis/inchworm.DS.fa.min100 /home/ssm-user/trinity_out_dir/both.fa | samtools view -@ 4 -F4 -Sb - | samtools sort -m 134217728 -@ 4 -no /home/ssm-user/trinity_out_dir/chrysalis/iworm.bowtie.nameSorted.bam" * [Tue May 30 06:07:32 2023] Running CMD: /usr/local/bin/util/support_scripts/scaffold_iworm_contigs.pl /home/ssm-user/trinity_out_dir/chrysalis/iworm.bowtie.nameSorted.bam /home/ssm-user/trinity_out_dir/chrysalis/inchworm.DS.fa.min100 > /home/ssm-user/trinity_out_dir/chrysalis/iworm_scaffolds.txt * [Tue May 30 06:07:32 2023] Running CMD: /usr/local/bin/Chrysalis/bin/GraphFromFasta -i /home/ssm-user/trinity_out_dir/chrysalis/inchworm.DS.fa.min100 -r /home/ssm-user/trinity_out_dir/both.fa 
-min_contig_length 200 -min_glue 2 -glue_factor 0.05 -min_iso_ratio 0.05 -t 4 -k 24 -kk 48 -scaffolding /home/ssm-user/trinity_out_dir/chrysalis/iworm_scaffolds.txt > /home/ssm-user/trinity_out_dir/chrysalis/iworm_cluster_welds_graph.txt * [Tue May 30 06:07:34 2023] Running CMD: /usr/bin/sort --parallel=4 -T . -S 1G -k9,9gr /home/ssm-user/trinity_out_dir/chrysalis/iworm_cluster_welds_graph.txt > /home/ssm-user/trinity_out_dir/chrysalis/iworm_cluster_welds_graph.txt.sorted * [Tue May 30 06:07:34 2023] Running CMD: /usr/local/bin/util/support_scripts/annotate_chrysalis_welds_with_iworm_names.pl /home/ssm-user/trinity_out_dir/chrysalis/inchworm.DS.fa.min100 /home/ssm-user/trinity_out_dir/chrysalis/iworm_cluster_welds_graph.txt.sorted > /home/ssm-user/trinity_out_dir/chrysalis/iworm_cluster_welds_graph.txt.sorted.wIwormNames * [Tue May 30 06:07:34 2023] Running CMD: /usr/local/bin/Chrysalis/bin/BubbleUpClustering -i /home/ssm-user/trinity_out_dir/chrysalis/inchworm.DS.fa.min100 -weld_graph /home/ssm-user/trinity_out_dir/chrysalis/iworm_cluster_welds_graph.txt.sorted -min_contig_length 200 -max_cluster_size 25 > /home/ssm-user/trinity_out_dir/chrysalis/GraphFromIwormFasta.out * [Tue May 30 06:07:34 2023] Running CMD: /usr/local/bin/Chrysalis/bin/CreateIwormFastaBundle -i /home/ssm-user/trinity_out_dir/chrysalis/GraphFromIwormFasta.out -o /home/ssm-user/trinity_out_dir/chrysalis/bundled_iworm_contigs.fasta -min 200 * [Tue May 30 06:07:34 2023] Running CMD: /usr/local/bin/Chrysalis/bin/ReadsToTranscripts -i /home/ssm-user/trinity_out_dir/both.fa -f /home/ssm-user/trinity_out_dir/chrysalis/bundled_iworm_contigs.fasta -o /home/ssm-user/trinity_out_dir/chrysalis/readsToComponents.out -t 4 -max_mem_reads 50000000 -p 10 * [Tue May 30 06:07:37 2023] Running CMD: /usr/bin/sort --parallel=4 -T . 
-S 1G -k 1,1n -k3,3nr -k2,2 /home/ssm-user/trinity_out_dir/chrysalis/readsToComponents.out > /home/ssm-user/trinity_out_dir/chrysalis/readsToComponents.out.sort Tuesday, May 30, 2023: 06:07:37 CMD: mkdir -p /home/ssm-user/trinity_out_dir/read_partitions/Fb_0/CBin_0 Tuesday, May 30, 2023: 06:07:37 CMD: touch /home/ssm-user/trinity_out_dir/partitioned_reads.files.list.ok Tuesday, May 30, 2023: 06:07:37 CMD: /usr/local/bin/util/support_scripts/write_partitioned_trinity_cmds.pl --reads_list_file /home/ssm-user/trinity_out_dir/partitioned_reads.files.list --CPU 1 --max_memory 1G --run_as_paired --seqType fa --trinity_complete --full_cleanup --NO_SEQTK --no_salmon > recursive_trinity.cmds Tuesday, May 30, 2023: 06:07:37 CMD: touch recursive_trinity.cmds.ok Tuesday, May 30, 2023: 06:07:37 CMD: touch recursive_trinity.cmds.ok -------------------------------------------------------------------------------- ------------ Trinity Phase 2: Assembling Clusters of Reads --------------------- ------- (involving the Inchworm, Chrysalis, Butterfly trifecta ) --------------- -------------------------------------------------------------------------------- Tuesday, May 30, 2023: 06:07:37 CMD: /usr/local/bin/trinity-plugins/BIN/ParaFly -c recursive_trinity.cmds -CPU 4 -v -shuffle Number of Commands: 38 succeeded(38) 100% completed. All commands completed successfully. :-) ** Harvesting all assembled transcripts into a single multi-fasta file... 
Tuesday, May 30, 2023: 06:08:29 CMD: find /home/ssm-user/trinity_out_dir/read_partitions/ -name '*inity.fasta' | /usr/local/bin/util/support_scripts/partitioned_trinity_aggregator.pl --token_prefix TRINITY_DN --output_prefix /home/ssm-user/trinity_out_dir/Trinity.tmp * [Tue May 30 06:08:29 2023] Running CMD: /usr/local/bin/util/support_scripts/salmon_runner.pl Trinity.tmp.fasta /home/ssm-user/trinity_out_dir/both.fa 4 * [Tue May 30 06:08:31 2023] Running CMD: /usr/local/bin/util/support_scripts/filter_transcripts_require_min_cov.pl Trinity.tmp.fasta /home/ssm-user/trinity_out_dir/both.fa salmon_outdir/quant.sf 2 > /home/ssm-user/trinity_out_dir.Trinity.fasta ############################################################################# Finished. Final Trinity assemblies are written to /home/ssm-user/trinity_out_dir.Trinity.fasta ############################################################################# Tuesday, May 30, 2023: 06:08:31 CMD: /usr/local/bin/util/support_scripts/get_Trinity_gene_to_trans_map.pl /home/ssm-user/trinity_out_dir.Trinity.fasta > /home/ssm-user/trinity_out_dir.Trinity.fasta.gene_trans_map
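In this environment, running Trinity on the sample files took the following amounts of time.
Item | Time |
---|---|
Container build | 4 min |
Creating the SIF container image | 14 min |
Trinity run | 2 min |
The SIF container image is cached, so subsequent runs take less time.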
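As an alternative to relying on the cache, the image can be converted once into an explicit SIF file with `apptainer pull` and reused from there. The sketch below was not part of this test; `ensure_sif` and the file name `trinityrnaseq.sif` are hypothetical.

```shell
# ensure_sif SIF_FILE IMAGE_URI: pull the image into SIF_FILE unless it already exists.
ensure_sif() {
  if [ -f "$1" ]; then
    echo "reusing existing $1"
    return 0
  fi
  echo "pulling $2 into $1"
  apptainer pull "$1" "$2"
}

# Hypothetical usage (the local file name is an assumption):
#   ensure_sif trinityrnaseq.sif docker://trinityrnaseq/trinityrnaseq:latest
#   apptainer run trinityrnaseq.sif Trinity --seqType fq --left ... --right ...
```

Keeping the SIF as a named file makes it obvious where the image lives and how much disk it uses, which matters on a small root volume.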
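Troubleshooting Errors
Several errors needed to be resolved when running Trinity in this test, so I am recording them here for reference.
Insufficient Free Storage Space
If you see the error no space left on device, the storage has run out of free space.
The Ubuntu AMI used here defaults to 8 GiB of storage, while the Trinity image on Docker Hub is 3.79 GB. Building the SIF also writes temporary layer data under /tmp, as the error below shows, so you need storage with some headroom beyond the image size itself.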
FATAL: Unable to handle docker://trinityrnaseq/trinityrnaseq:latest uri: while building SIF from layers: conveyor failed to get: writing blob: write /tmp/bundle-temp-1656152970/oci-put-blob1557947261: no space left on device
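Checking free space before starting the build can catch this early. The following is a minimal sketch using POSIX `df`; `free_gib` is a hypothetical helper, and the 15 GiB threshold is an assumption for illustration, not a measured requirement.

```shell
# free_gib DIR: print the free space on the filesystem holding DIR, in whole GiB.
free_gib() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Assumed threshold: room for the 3.79 GB image plus temporary layers and the SIF file.
REQUIRED_GIB="${REQUIRED_GIB:-15}"
# Check the directory named by APPTAINER_TMPDIR if it is set, otherwise /tmp
# (the error above shows the build writing under /tmp).
BUILD_TMP="${APPTAINER_TMPDIR:-/tmp}"

avail="$(free_gib "$BUILD_TMP")"
if [ "$avail" -lt "$REQUIRED_GIB" ]; then
  echo "only ${avail} GiB free under ${BUILD_TMP}; consider a larger volume" >&2
else
  echo "${avail} GiB free under ${BUILD_TMP}"
fi
```

If enlarging the root volume is not an option, pointing `APPTAINER_TMPDIR` and `APPTAINER_CACHEDIR` at a directory on a larger volume is another approach worth trying.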
Invalid Sample Files
If you see the error no reads made it to the normalization process, there may be a problem with the downloaded sample files.
In this test, the error occurred because the GitHub URL passed to wget was wrong.
Error, no reads made it to the normalization process... at /usr/local/bin/util/..//util/support_scripts//nbkc_normalize.pl line 119.
Error, cmd: /usr/local/bin/util/..//util/support_scripts//nbkc_normalize.pl --stats_file pairs.K25.stats --max_cov 200 --min_cov 1 --max_CV 10000 > pairs.K25.stats.C200.maxCV10000.accs died with ret 65280 at /usr/local/bin/util/insilico_read_normalization.pl line 807.
Error, cmd: /usr/local/bin/util/insilico_read_normalization.pl --seqType fq --JM 1G --max_cov 200 --min_cov 1 --CPU 4 --output /home/ssm-user/tmp/trinity_out_dir/insilico_read_normalization --max_CV 10000 --NO_SEQTK --left /home/ssm-user/tmp/reads.left.fq.gz --right /home/ssm-user/tmp/reads.right.fq.gz --pairs_together --PARALLEL_STATS died with ret 512 at /usr/local/bin/Trinity line 2919.
    main::process_cmd("/usr/local/bin/util/insilico_read_normalization.pl --seqType "...) called at /usr/local/bin/Trinity line 3472
    main::normalize("/home/ssm-user/tmp/trinity_out_dir/insilico_read_normalization", 200, ARRAY(0x557ecfaa43b0), ARRAY(0x557ecfaa43e0)) called at /usr/local/bin/Trinity line 3412
    main::run_normalization(200, ARRAY(0x557ecfaa43b0), ARRAY(0x557ecfaa43e0)) called at /usr/local/bin/Trinity line 1450
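A quick sanity check on the input files before launching Trinity can surface this kind of problem earlier. The sketch below assumes gzipped FASTQ input with four lines per record; `count_reads`, `first_char`, and `check_pair` are hypothetical helper names.

```shell
# count_reads FILE.fq.gz: print the number of FASTQ records (4 lines per record).
count_reads() {
  gzip -dc "$1" 2>/dev/null | awk 'END { print int(NR / 4) }'
}

# first_char FILE.fq.gz: print the first character of the decompressed file.
# A FASTQ file starts with "@"; a bad download yields something else or nothing.
first_char() {
  gzip -dc "$1" 2>/dev/null | head -n 1 | cut -c 1
}

# check_pair LEFT RIGHT: fail if either file is not FASTQ-like or the read counts differ.
check_pair() {
  for _f in "$1" "$2"; do
    if [ "$(first_char "$_f")" != "@" ]; then
      echo "not a gzipped FASTQ file: $_f" >&2
      return 1
    fi
  done
  _left="$(count_reads "$1")"
  _right="$(count_reads "$2")"
  if [ "$_left" -ne "$_right" ]; then
    echo "read counts differ: left=$_left right=$_right" >&2
    return 1
  fi
  echo "ok: $_left read pairs"
}

# Hypothetical usage:
#   check_pair reads.left.fq.gz reads.right.fq.gz
```

This only checks the overall shape of the files; it does not validate individual records or confirm that read names match between the two files.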
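Summary
In this post, I ran Trinity on a single EC2 instance with Apptainer installed.
Troubleshooting took time because I could find little information online, but I feel my knowledge is building up bit by bit.
Next time, I plan to write about testing with AWS ParallelCluster!
I hope this article is helpful to someone. See you next time!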